Enriching a Portuguese WordNet using Synonyms from a Monolingual Dictionary

Authors

  • Alberto Simões
  • Xavier Gómez Guinovart
  • José João Almeida
Abstract

In this article we present an exploratory approach to enriching a WordNet-like lexical ontology with the synonyms found in a standard monolingual Portuguese dictionary. The dictionary was converted from PDF into XML, and senses were automatically identified and annotated. This allowed us to extract them, independently of definitions, and to create sets of synonyms (synsets). These synsets were then aligned with WordNet synsets, both in the same language (Portuguese) and by projecting the Portuguese terms into English, Spanish and Galician. This process allowed both the addition of new term variants to existing synsets and the creation of new synsets for Portuguese.
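The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the overlap criterion, the `min_overlap` threshold, the function name `align_synonym_sets`, and the synset identifiers are all assumptions made for illustration. The sketch matches each dictionary-derived synonym set to the existing synset with which it shares the most variants, and treats sets with no overlap at all as candidates for new synsets.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def align_synonym_sets(
    candidates: List[Set[str]],
    synsets: Dict[str, Set[str]],
    min_overlap: int = 2,
) -> Tuple[Dict[str, Set[str]], List[FrozenSet[str]]]:
    """Align dictionary-derived synonym sets with existing synsets.

    Assumption (not taken from the paper): a candidate matches the synset
    with which it shares the most variants, provided that at least
    `min_overlap` variants are shared.
    """
    new_variants: Dict[str, Set[str]] = {}   # synset id -> variants to add
    new_synsets: List[FrozenSet[str]] = []   # candidates with no match at all

    for candidate in candidates:
        best_id, best_shared = None, 0
        for synset_id, variants in synsets.items():
            shared = len(candidate & variants)
            if shared > best_shared:
                best_id, best_shared = synset_id, shared

        if best_id is not None and best_shared >= min_overlap:
            # Strong match: variants missing from the synset enrich it.
            extra = candidate - synsets[best_id]
            if extra:
                new_variants.setdefault(best_id, set()).update(extra)
        elif best_shared == 0:
            # No overlap with any synset: propose a new synset.
            new_synsets.append(frozenset(candidate))
        # Weak overlap (0 < shared < min_overlap) is ambiguous and skipped here.

    return new_variants, new_synsets


if __name__ == "__main__":
    # Hypothetical toy data; the synset ids are made up.
    existing = {"pt-0001": {"carro", "automóvel"}, "pt-0002": {"casa", "lar"}}
    extracted = [{"carro", "automóvel", "viatura"}, {"alegre", "contente"}]
    print(align_synonym_sets(extracted, existing))
```

Requiring at least two shared variants is a deliberately conservative choice in this sketch: a single shared word is often polysemous, so a one-word overlap is left undecided instead of being forced into either outcome.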


Similar resources

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...


Processing and extracting data from an open dictionary of the Portuguese language

Synonym dictionaries are useful resources for natural language processing. Unfortunately, their availability in digital format is limited, as publishing companies do not release their dictionaries in open digital formats. Dicionário-Aberto (Simões and Farinha, 2010) is an open and free digital synonym dictionary for the Portuguese language. It is in the public domain and in textual digital form...


Enriching Slovene WordNet with domain-specific terms

The paper describes an innovative approach to expanding the domain coverage of wordnet by exploiting multiple resources. In the experiment described here we are using a large monolingual Slovene corpus of texts from the domain of informatics to harvest terminology from, and a parallel English-Slovene corpus and an online dictionary as bilingual resources to facilitate the mapping of terms to th...


Dictionary Alignment by Rewrite-based Entry Translation

In this document we describe the process of aligning two standard monolingual dictionaries: a Portuguese language dictionary and a Galician synonym dictionary. The main goal of the project is to provide an online dictionary that can show, in parallel, definitions and synonyms in Portuguese and Galician for a specific word, written in Portuguese or Galician. These two languages are very close to...


Monolingual and bilingual dictionary approaches to the enrichment of the Spanish WordNet with adjectives

We report on two different approaches to the incorporation of adjectives in Spanish WordNet based on automatic extraction techniques using EuroWordNet and machine-readable dictionaries. We show that a monolingual dictionary approach makes it possible to exploit relations between different parts of speech and enrich the internal structure of the Spanish WordNet, while the methods based on bilingual dictio...




Publication date: 2016